4. Identify Lines Using HITRAN Manually
[1]:
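# Import necessary modules (os, numpy, and pandas are used directly in the cells below)
import os
import numpy as np
import pandas as pd
from Xpectra.SpecFitAnalyzer import SpecFitAnalyzer
from Xpectra.LineAssigner import *
from Xpectra.SpecStatVisualizer import plot_fitted_als_bokeh, plot_spectra_errorbar_bokeh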
4.1 - Load the original and baseline-corrected spectra
\(\rightarrow\) In step 2, we corrected the spectral baseline and saved the result as a CSV file in the processed_data directory. Here we load that file into a DataFrame:
[2]:
# Read the reference-data path from the environment variable
__reference_data_path__ = os.getenv("Xpectra_reference_data")
# Import baseline corrected spectrum
corrected_spectrum = pd.read_csv(os.path.join(__reference_data_path__,'processed_data','arpls_baseline_corrected_methane_spectrum.csv'))
# Assign wavenumber (x) and signal (y) arrays
x = corrected_spectrum['original_x'].dropna().to_numpy()
y = corrected_spectrum['original_y'].dropna().to_numpy()
x_baseline_corr = corrected_spectrum['baseline_corrected_x'].dropna().to_numpy()
y_baseline_corr = corrected_spectrum['baseline_corrected_y'].dropna().to_numpy()
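\(\rightarrow\) `os.getenv` returns `None` when the variable is not set, and `os.path.join(None, ...)` then fails with a `TypeError` that does not mention the environment variable. A minimal sketch of a guard is shown below. The helper name `resolve_reference_path` and the placeholder directory are hypothetical and not part of Xpectra:

```python
import os

def resolve_reference_path(env_var="Xpectra_reference_data", environ=None):
    """Return the directory stored in env_var, or raise a descriptive error.

    `environ` defaults to os.environ; passing a plain dict makes the helper
    easy to try out without touching the real environment.
    """
    environ = os.environ if environ is None else environ
    path = environ.get(env_var)
    if not path:
        raise EnvironmentError(f"Environment variable {env_var!r} is not set.")
    return path

# Example with an explicit mapping (placeholder directory, for illustration only)
demo_env = {"Xpectra_reference_data": os.path.join("path", "to", "reference_data")}
csv_path = os.path.join(resolve_reference_path(environ=demo_env),
                        "processed_data",
                        "arpls_baseline_corrected_methane_spectrum.csv")
print(csv_path)
```

In the notebook, calling `resolve_reference_path()` without arguments would read the real environment and fail early with a readable message if the variable is missing.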
\(\rightarrow\) Visualize the imported spectra:
[3]:
# Recover the previously fitted baseline by subtracting the corrected signal from the original
spectral_baseline = y - y_baseline_corr
plot_fitted_als_bokeh(wavenumber_values = x,
                      signal_values = y,
                      fitted_baseline = spectral_baseline,
                      baseline_type = 'arpls'
                      )
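\(\rightarrow\) The subtraction above relies on the relation corrected = original - baseline, so baseline = original - corrected. It also requires both arrays to have the same length, which may not hold if the separate `.dropna()` calls remove a different number of rows from each column. The sketch below illustrates the arithmetic on synthetic arrays; all `*_demo` names and values are made up for illustration:

```python
import numpy as np

# Synthetic stand-ins for the notebook's arrays (illustrative values only)
x_demo = np.linspace(2911.0, 2912.0, 5)
baseline_demo = 0.10 + 0.02 * (x_demo - x_demo[0])     # slowly varying baseline
corrected_demo = np.array([0.0, 0.5, 0.1, 0.3, 0.0])   # baseline-free signal
y_demo = corrected_demo + baseline_demo                # signal with baseline

# Element-wise subtraction only makes sense on identically shaped arrays
if y_demo.shape != corrected_demo.shape:
    raise ValueError("Original and corrected signals have different lengths.")

recovered_baseline = y_demo - corrected_demo
print(np.allclose(recovered_baseline, baseline_demo))
```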
4.2 - Load and parse the HITRAN line list
\(\rightarrow\) The next step is to load the HITRAN line list into a DataFrame. For this, we use the LineAssigner class, instantiating it with the baseline-corrected spectrum and the HITRAN file path.
[4]:
# Read the reference-data path from the environment variable
__reference_data_path__ = os.getenv("Xpectra_reference_data")
# Define path to HITRAN data
input_file = os.path.join(__reference_data_path__, 'datasets','CH4_nu3.par')
# Initialize LineAssigner
assign = LineAssigner(wavenumber_values = x_baseline_corr,
                      signal_values = y_baseline_corr,
                      hitran_file = input_file,
                      absorber_name= 'CH4')
\(\rightarrow\) With the class initialized, we now parse the line list to a DataFrame. The default columns converted to the DataFrame are: ‘local_iso_id’, ‘nu’, ‘sw’, ‘gamma_air’, ‘local_upper_quanta’, and ‘ierr’.
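\(\rightarrow\) This function automatically separates the local quanta terms into the J quantum number, the N quantum number, and the symmetry, for both the lower and upper states (columns J_low, N_low, sym_low, J_up, N_up, and sym_up).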
[5]:
# Parse file to DataFrame
assign.parse_file_to_dataframe()
[5]:
| | molec_id | local_iso_id | nu | sw | a | gamma_air | gamma_self | elower | n_air | delta_air | ... | iref | line_mixing_flag | gp | gpp | J_low | sym_low | N_low | J_up | sym_up | N_up |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 6 | 2 | 2900.000621 | 1.825000e-25 | 0.023890 | 0.0490 | 0.067 | 814.6845 | 0.63 | -0.005800 | ... | 64 | 3 | 3253433.0 | None | 12 | A1 | 1 | 13 | A2 | 9 |
| 1 | 6 | 2 | 2900.005693 | 6.307000e-27 | 0.005030 | 0.0470 | 0.065 | 1096.0334 | 0.62 | -0.005800 | ... | 64 | 3 | 3253433.0 | None | 14 | F2 | 3 | 14 | F1 | 40 |
| 2 | 6 | 2 | 2900.022027 | 3.048000e-27 | 0.022620 | 0.0460 | 0.060 | 1593.6378 | 0.61 | -0.005800 | ... | 64 | 3 | 3253433.0 | None | 17 | F2 | 2 | 17 | F1 | 47 |
| 3 | 6 | 1 | 2900.027223 | 1.891000e-25 | 0.000465 | 0.0480 | 0.067 | 815.1315 | 0.63 | -0.005800 | ... | 34 | 3 | 3245363.0 | None | 12 | F1 | 3 | 13 | F2 | 21 |
| 4 | 6 | 2 | 2900.035027 | 1.905000e-25 | 0.067460 | 0.0400 | 0.067 | 815.0317 | 0.63 | -0.005800 | ... | 64 | 3 | 3253433.0 | None | 12 | E | 2 | 12 | E | 25 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 41623 | 6 | 2 | 3299.877822 | 1.652000e-29 | 0.000353 | 0.0450 | 0.059 | 1780.0695 | 0.60 | -0.006500 | ... | 54 | 3 | 3253433.0 | None | 18 | F2 | 3 | 19 | F1 | 85 |
| 41624 | 6 | 1 | 3299.900527 | 5.946000e-29 | 0.000004 | 0.0380 | 0.061 | 1416.5543 | 0.61 | -0.006500 | ... | 34 | 3 | 3240363.0 | None | 16 | E | 1 | 17 | E | 52 |
| 41625 | 6 | 3 | 3299.901848 | 7.204000e-29 | 0.000221 | 0.0589 | 0.077 | 532.9581 | 0.75 | -0.006346 | ... | 44 | 4 | 2243323.0 | None | 11 | E | 4 | 11 | E | 2 |
| 41626 | 6 | 1 | 3299.984795 | 2.838000e-25 | 0.035670 | 0.0470 | 0.099 | 1526.2146 | 0.75 | -0.006600 | ... | 32 | 3 | 3333232.0 | None | 6 | A2 | 1 | 6 | A1 | 31 |
| 41627 | 6 | 2 | 3299.989099 | 5.343000e-29 | 0.000730 | 0.0380 | 0.060 | 1594.0043 | 0.61 | -0.006500 | ... | 54 | 3 | 3253433.0 | None | 17 | E | 2 | 18 | E | 54 |
41628 rows × 25 columns
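\(\rightarrow\) For orientation, a `.par` file in the 160-character HITRAN format stores each transition as one fixed-width record. The sketch below is not Xpectra's parser. It only illustrates how such a record splits into the named fields, using a synthetic record built from the numeric values of row 0 in the table above, with the quanta and reference fields left blank. The helper name `split_par_record` is hypothetical:

```python
# Field names and character widths of the 160-character HITRAN .par record
FIELDS = [
    ("molec_id", 2), ("local_iso_id", 1), ("nu", 12), ("sw", 10), ("a", 10),
    ("gamma_air", 5), ("gamma_self", 5), ("elower", 10), ("n_air", 4),
    ("delta_air", 8), ("global_upper_quanta", 15), ("global_lower_quanta", 15),
    ("local_upper_quanta", 15), ("local_lower_quanta", 15), ("ierr", 6),
    ("iref", 12), ("line_mixing_flag", 1), ("gp", 7), ("gpp", 7),
]

def split_par_record(record):
    """Slice one fixed-width record into a dict of stripped strings."""
    fields = {}
    position = 0
    for name, width in FIELDS:
        fields[name] = record[position:position + width].strip()
        position += width
    return fields

# Synthetic record: numeric values from row 0 above, other fields left blank
row0 = {"molec_id": "6", "local_iso_id": "2", "nu": "2900.000621",
        "sw": "1.825E-25", "a": "2.389E-02", "gamma_air": ".0490",
        "gamma_self": "0.067", "elower": "814.6845", "n_air": "0.63",
        "delta_air": "-.005800"}
record = "".join(row0.get(name, "").rjust(width) for name, width in FIELDS)

parsed = split_par_record(record)
print(len(record), parsed["molec_id"], float(parsed["nu"]), float(parsed["sw"]))
```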
\(\rightarrow\) The HITRAN DataFrame is now accessible through the class attribute hitran_df:
[6]:
# Display header and first 3 rows
assign.hitran_df.head(3)
[6]:
| | molec_id | local_iso_id | nu | sw | a | gamma_air | gamma_self | elower | n_air | delta_air | ... | iref | line_mixing_flag | gp | gpp | J_low | sym_low | N_low | J_up | sym_up | N_up |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 6 | 2 | 2900.000621 | 1.825000e-25 | 0.02389 | 0.049 | 0.067 | 814.6845 | 0.63 | -0.0058 | ... | 64 | 3 | 3253433.0 | None | 12 | A1 | 1 | 13 | A2 | 9 |
| 1 | 6 | 2 | 2900.005693 | 6.307000e-27 | 0.00503 | 0.047 | 0.065 | 1096.0334 | 0.62 | -0.0058 | ... | 64 | 3 | 3253433.0 | None | 14 | F2 | 3 | 14 | F1 | 40 |
| 2 | 6 | 2 | 2900.022027 | 3.048000e-27 | 0.02262 | 0.046 | 0.060 | 1593.6378 | 0.61 | -0.0058 | ... | 64 | 3 | 3253433.0 | None | 17 | F2 | 2 | 17 | F1 | 47 |
3 rows × 25 columns
4.3 - Identify Peaks Manually
\(\rightarrow\) Next, we identify the location (in wavenumber) of each peak in our methane spectrum. To accomplish this, we use the LineAssigner class.
4.3.1 - Select wavenumber range for analysis
\(\rightarrow\) Often we are interested in only part of the spectrum, or the full spectrum has too many peaks to process at once. We select a range of wavenumbers for our analysis:
[7]:
wavenumber_range = (2911.15, 2911.9) # cm^-1
\(\rightarrow\) Let's visualize the spectrum within this wavenumber range:
[8]:
plot_spectra_errorbar_bokeh(wavenumber_values = x_baseline_corr,
                            signal_values = y_baseline_corr,
                            wavenumber_range = wavenumber_range,
                            absorber_name = 'CH4',
                            plot_type = 'line')
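\(\rightarrow\) Restricting arrays to a wavenumber range can be done with a boolean mask. The sketch below assumes inclusive bounds, which may differ from how Xpectra treats the range internally. The helper name `select_range` and the synthetic spectrum are hypothetical:

```python
import numpy as np

def select_range(wavenumbers, signal, wavenumber_range):
    """Return the (wavenumber, signal) samples inside an inclusive range."""
    wavenumbers = np.asarray(wavenumbers, dtype=float)
    signal = np.asarray(signal, dtype=float)
    low, high = wavenumber_range
    mask = (wavenumbers >= low) & (wavenumbers <= high)
    return wavenumbers[mask], signal[mask]

# Synthetic spectrum with a single Gaussian feature (illustrative values only)
x_demo = np.linspace(2911.0, 2912.0, 101)
y_demo = np.exp(-0.5 * ((x_demo - 2911.5) / 0.02) ** 2)

x_sel, y_sel = select_range(x_demo, y_demo, (2911.15, 2911.9))
print(x_sel.min(), x_sel.max(), x_sel.size)
```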
4.3.2 - Find the peaks manually
\(\rightarrow\) Find the spectral peaks manually by clicking on the figure, which prints the values of each selected point:
[9]:
assign.line_finder_manual(wavenumber_range=wavenumber_range)
\(\rightarrow\) Paste the printed peak coordinates into a list, then extract the peak centers and heights:
[10]:
guesses_list = [[2911.187, 0.504], [2911.262, 0.594], [2911.287, 0.403],
[2911.350, 0.545], [2911.402, 0.450], [2911.518, 0.160],
[2911.623, 0.549], [2911.676, 0.100], [2911.698, 0.195]]
initial_guesses = np.array(guesses_list)
peak_centers = initial_guesses[:,0]
peak_heights = initial_guesses[:,1]
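\(\rightarrow\) Clicked coordinates are only as precise as the click. One optional refinement, not part of the workflow described here, is to snap each guess to the highest sample within a small window around it. The sketch below uses a synthetic Gaussian peak; the helper name `snap_to_local_maximum` and the window size are assumptions:

```python
import numpy as np

def snap_to_local_maximum(wavenumbers, signal, guesses, half_window):
    """Move each guessed center to the largest sample within +/- half_window.

    Returns an array of (center, height) rows; a guess with no samples in
    its window is kept unchanged with a NaN height.
    """
    wavenumbers = np.asarray(wavenumbers, dtype=float)
    signal = np.asarray(signal, dtype=float)
    refined = []
    for guess in guesses:
        window = np.flatnonzero(np.abs(wavenumbers - guess) <= half_window)
        if window.size == 0:
            refined.append((guess, np.nan))
            continue
        best = window[np.argmax(signal[window])]
        refined.append((wavenumbers[best], signal[best]))
    return np.array(refined)

# Synthetic peak centered at 2911.350 cm^-1 with height 0.545 (illustrative)
x_demo = np.linspace(2911.0, 2912.0, 1001)
y_demo = 0.545 * np.exp(-0.5 * ((x_demo - 2911.350) / 0.005) ** 2)

refined = snap_to_local_maximum(x_demo, y_demo, guesses=[2911.353], half_window=0.01)
print(refined)
```

The window should stay narrower than the spacing between neighboring peaks, otherwise two guesses can snap to the same maximum.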
\(\rightarrow\) Manually update the class instance with the selected peak centers:
[11]:
assign.peak_centers_manual = peak_centers
4.4 - Identify the lines
\(\rightarrow\) Compare the peaks with known lines.
\(\rightarrow\) For each peak in the lab spectrum, find the closest line in the HITRAN line list:
[12]:
# Filter the HITRAN line list
filters = {'local_iso_id' : [1,2]} # Only search isotopologues with local_iso_id 1 or 2
# Match found lines, plot them over spectrum, and display DataFrame
assign.hitran_line_assigner(threshold = 0.01,
                            filters = filters,
                            columns_to_print = ['local_iso_id', 'J_up','nu','peak_center'], # Columns printed over each line
                            wavenumber_range = wavenumber_range,
                            __print__ = True, # Display the fitted HITRAN DataFrame
                            __plot_bokeh__ = True, # Plot interactively with Bokeh
                            __plot_seaborn__ = False
                            )
| | molec_id | local_iso_id | nu | sw | a | gamma_air | gamma_self | elower | n_air | delta_air | ... | line_mixing_flag | gp | gpp | J_low | sym_low | N_low | J_up | sym_up | N_up | peak_center |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 6 | 1 | 2911.186061 | 5.284000e-23 | 0.057940 | 0.0576 | 0.070 | 575.2596 | 0.67 | -0.007580 | ... | 3 | 4345363.0 | None | 10 | F1 | 2 | 9 | F2 | 35 | 2911.187 |
| 1 | 6 | 1 | 2911.261561 | 6.751000e-23 | 0.074010 | 0.0572 | 0.070 | 575.1841 | 0.67 | -0.008480 | ... | 3 | 4345363.0 | None | 10 | F1 | 1 | 9 | F2 | 35 | 2911.262 |
| 2 | 6 | 1 | 2911.285780 | 3.903000e-23 | 0.042810 | 0.0576 | 0.070 | 575.2852 | 0.67 | -0.007600 | ... | 3 | 4345363.0 | None | 10 | F2 | 3 | 9 | F1 | 36 | 2911.287 |
| 3 | 6 | 2 | 2911.348367 | 5.866000e-23 | 0.602700 | 0.0618 | 0.085 | 104.7777 | 0.75 | -0.002122 | ... | 3 | 3335212.0 | None | 4 | A1 | 1 | 5 | A2 | 6 | 2911.350 |
| 4 | 6 | 1 | 2911.401080 | 4.331000e-23 | 0.047480 | 0.0573 | 0.070 | 575.1699 | 0.67 | -0.008890 | ... | 3 | 4345363.0 | None | 10 | F2 | 2 | 9 | F1 | 36 | 2911.402 |
| 5 | 6 | 1 | 2911.518480 | 1.271000e-23 | 0.013930 | 0.0583 | 0.070 | 575.0525 | 0.67 | -0.008430 | ... | 3 | 4345363.0 | None | 10 | F2 | 1 | 9 | F1 | 36 | 2911.518 |
| 6 | 6 | 1 | 2911.622555 | 5.719000e-23 | 0.037600 | 0.0587 | 0.070 | 575.0555 | 0.67 | -0.008330 | ... | 3 | 4345363.0 | None | 10 | A2 | 1 | 9 | A1 | 11 | 2911.623 |
| 7 | 6 | 1 | 2911.674563 | 7.653000e-24 | 3.939000 | 0.0390 | 0.062 | 1817.8431 | 0.75 | -0.005823 | ... | 3 | 3333232.0 | None | 9 | F2 | 7 | 8 | F1 | 75 | 2911.676 |
| 8 | 6 | 2 | 2911.697399 | 3.172000e-29 | 0.000225 | 0.0450 | 0.060 | 1594.1021 | 0.61 | -0.005800 | ... | 3 | 3253433.0 | None | 17 | F1 | 4 | 18 | F2 | 27 | 2911.698 |
9 rows × 26 columns
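\(\rightarrow\) The idea of closest-line matching with a threshold can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the implementation of `hitran_line_assigner`. It assumes the threshold is an absolute wavenumber difference in cm^-1 and uses only the nine `nu` values from the table above together with the manually selected peak centers. The helper name `match_nearest_lines` is hypothetical:

```python
import numpy as np

def match_nearest_lines(peak_centers, line_positions, threshold):
    """For each peak, find the closest line and test it against a threshold.

    Returns (index of nearest line, absolute offset, boolean match flag).
    """
    peak_centers = np.asarray(peak_centers, dtype=float)
    line_positions = np.asarray(line_positions, dtype=float)
    # Pairwise |peak - line| distances, shape (n_peaks, n_lines)
    distance = np.abs(peak_centers[:, None] - line_positions[None, :])
    nearest = distance.argmin(axis=1)
    offset = distance[np.arange(peak_centers.size), nearest]
    return nearest, offset, offset <= threshold

# 'nu' column of the table above and the manually selected peak centers
hitran_nu = [2911.186061, 2911.261561, 2911.285780, 2911.348367, 2911.401080,
             2911.518480, 2911.622555, 2911.674563, 2911.697399]
peak_centers_demo = [2911.187, 2911.262, 2911.287, 2911.350, 2911.402,
                     2911.518, 2911.623, 2911.676, 2911.698]

nearest, offset, matched = match_nearest_lines(peak_centers_demo, hitran_nu,
                                               threshold=0.01)
for center, index, off, ok in zip(peak_centers_demo, nearest, offset, matched):
    print(f"peak {center:.3f} -> nu {hitran_nu[index]:.6f} "
          f"(offset {off:.6f} cm^-1, matched={ok})")
```

The pairwise distance matrix is convenient for a handful of peaks. For many peaks against a long, sorted line list, `np.searchsorted` would avoid building the full matrix.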
4.5 - Save the results: plots and DataFrames
\(\rightarrow\) Use the plot-saving functionality:
[13]:
assign.hitran_line_assigner(threshold = 0.02,
                            filters = filters,
                            columns_to_print = ['nu','peak_center'],
                            wavenumber_range = wavenumber_range,
                            __save_plot__ = True, # Save the plot (seaborn version)
                            __reference_data__ = __reference_data_path__)
<Figure size 7000x4200 with 0 Axes>
[14]:
# Add peak_heights
assign.fitted_hitran['peak_heights'] = peak_heights
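\(\rightarrow\) Assigning a NumPy array as a new column is positional, so it requires one height per row, in the same order as the rows. If the matched DataFrame could ever contain fewer rows than guesses, a safer pattern is to look up each height by its peak center. This is a hypothetical scenario, since the source does not describe how unmatched peaks are handled. The sketch below uses a small stand-in DataFrame rather than the real `assign.fitted_hitran`:

```python
import pandas as pd

# First three manual guesses from above: [peak center, peak height]
guesses = [[2911.187, 0.504], [2911.262, 0.594], [2911.287, 0.403]]
height_by_center = {center: height for center, height in guesses}

# Stand-in for a matched table in which the middle peak found no line
# (hypothetical case, for illustration only)
matched_demo = pd.DataFrame({"nu": [2911.186061, 2911.285780],
                             "peak_center": [2911.187, 2911.287]})

# Look up each height by its peak center instead of relying on row order
matched_demo["peak_heights"] = matched_demo["peak_center"].map(height_by_center)
print(matched_demo)
```

The lookup relies on exact floating-point equality, so it only works if `peak_center` holds the same unrounded values that were passed in as guesses.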
\(\rightarrow\) Save the fitted HITRAN DataFrame to a CSV file:
[15]:
df = assign.fitted_hitran
# Define file name
file_name = "closest_hitran_lines_manual.csv"
# Save DataFrame to CSV
df.to_csv(os.path.join(__reference_data_path__,'processed_data',file_name), index=False)